AI023

Introduction to Triton Programming

Implementing Your First Kernel: Vector Addition

Lecture: Lesson 5

Date: 2026-03-31

Teacher: AI Tutor

Duration: 60 minutes

Learning Objectives

Identify the core components of a CUDA kernel declared with the __global__ specifier
Implement device memory allocation and data transfer between host and device
Calculate global thread indices to map data elements to individual GPU threads
Execute and synchronize a parallel kernel launch using grid and block configurations
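
The four objectives above can be seen together in one small program. What follows is a minimal sketch of vector addition in CUDA, not a reference implementation. The names vectorAdd, blocksFor, and ok, the problem size of 1000 elements, and the block size of 256 threads are all assumptions chosen for illustration. For brevity, the early-return error paths do not free device memory.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Objective 1: __global__ marks a function that runs on the device
// and is launched from host code.
__global__ void vectorAdd(const float* a, const float* b, float* c, int n) {
    // Objective 3: global index = block offset + position inside the block.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // The last block may hold more threads than remaining elements.
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: number of blocks needed to cover n elements
// (integer division rounded up).
int blocksFor(int n, int threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical helper: report a failed CUDA runtime call.
bool ok(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    const int n = 1000;                 // assumed problem size
    const int threadsPerBlock = 256;    // assumed block size
    const size_t bytes = n * sizeof(float);

    // Host (CPU) buffers.
    std::vector<float> hA(n), hB(n), hC(n, 0.0f);
    for (int i = 0; i < n; ++i) {
        hA[i] = static_cast<float>(i);
        hB[i] = 2.0f * static_cast<float>(i);
    }

    // Objective 2: allocate device (GPU) memory and copy the inputs over.
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    if (!ok(cudaMalloc(&dA, bytes), "cudaMalloc dA")) return 1;
    if (!ok(cudaMalloc(&dB, bytes), "cudaMalloc dB")) return 1;
    if (!ok(cudaMalloc(&dC, bytes), "cudaMalloc dC")) return 1;
    if (!ok(cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice), "copy A")) return 1;
    if (!ok(cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice), "copy B")) return 1;

    // Objective 4: launch with a grid/block configuration, then synchronize.
    const int blocks = blocksFor(n, threadsPerBlock);
    vectorAdd<<<blocks, threadsPerBlock>>>(dA, dB, dC, n);
    if (!ok(cudaGetLastError(), "kernel launch")) return 1;
    if (!ok(cudaDeviceSynchronize(), "cudaDeviceSynchronize")) return 1;

    // Copy the result back to the host and check it.
    if (!ok(cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost), "copy C")) return 1;
    bool match = true;
    for (int i = 0; i < n; ++i) {
        if (std::fabs(hC[i] - (hA[i] + hB[i])) > 1e-5f) {
            match = false;
            break;
        }
    }
    std::printf("%s\n", match ? "vectorAdd OK" : "vectorAdd MISMATCH");

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return match ? 0 : 1;
}
```

With these assumed sizes, blocksFor(1000, 256) gives 4 blocks, so 1024 threads are launched and the bounds check in the kernel keeps the extra 24 from writing past the end of the arrays. The sketch should build with nvcc (for example, nvcc vector_add.cu -o vector_add, where the file name is arbitrary) and needs a CUDA-capable GPU to run.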